Verb Detection in Persian Corpus

Authors

  • Majid Iranpour Mobarakeh
  • Behrouz Minaei-Bidgoli
Abstract

A novel technique is introduced for detecting verbs and their inflections in Persian texts. This detection is useful in the preprocessing phase of natural language processing (NLP) and text mining tasks such as part-of-speech (POS) tagging and sentence boundary detection (SBD). In its first phase, the technique uses structural information about the Persian verb; in its second phase, it applies an n-gram approach to homograph disambiguation in order to improve performance. Experimental results show that the technique achieves high performance (99%), which makes it an exemplary solution for the Persian SBD and POS tagging problem domain.
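The two phases described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the authors' actual rules, lexicon, or n-gram model, so everything here is an assumption made for illustration: the tiny stem, prefix, and ending lists, the single homograph (مردم, which can read as the noun "people" or the verb "I died"), the bigram counts, and the function names `analyze` and `is_verb`.

```python
import re

# Toy lexicon (assumption, not the paper's): a few past stems, the
# past-tense personal endings, and common verbal prefixes.
PAST_STEMS = ["رفت", "گفت", "مرد"]            # went, said, died
ENDINGS = ["یم", "ید", "ند", "م", "ی", ""]     # 1pl, 2pl, 3pl, 1sg, 2sg, 3sg (zero)
PREFIXES = ["نمی", "می", "ن", ""]              # neg+imperfective, imperfective, neg, none

# Phase 1: structural match, prefix + optional zero-width non-joiner + stem + ending.
VERB_RE = re.compile(
    "(?P<prefix>" + "|".join(PREFIXES) + ")\u200c?"
    "(?P<stem>" + "|".join(PAST_STEMS) + ")"
    "(?P<ending>" + "|".join(ENDINGS) + ")"
)

def analyze(token):
    """Return the prefix/stem/ending split if the token has verb structure, else None."""
    match = VERB_RE.fullmatch(token)
    return match.groupdict() if match else None

# Phase 2: bigram-based homograph disambiguation.
I, THESE = "من", "این"                         # "I", "these"
HOMOGRAPH = PAST_STEMS[2] + ENDINGS[3]          # مردم: "people" or "I died"
HOMOGRAPHS = {HOMOGRAPH}

# Hypothetical (previous token, reading) counts, as if read off a tagged corpus.
BIGRAM_COUNTS = {
    (I, "VERB"): 8, (I, "NOUN"): 1,
    (THESE, "VERB"): 0, (THESE, "NOUN"): 9,
}

def is_verb(tokens, i):
    """Decide whether tokens[i] is a verb, using structure first and context second."""
    token = tokens[i]
    if analyze(token) is None:
        return False                            # no verb structure at all
    if token not in HOMOGRAPHS:
        return True                             # unambiguous structural match
    prev = tokens[i - 1] if i > 0 else "<s>"
    verb = BIGRAM_COUNTS.get((prev, "VERB"), 0)
    noun = BIGRAM_COUNTS.get((prev, "NOUN"), 0)
    return verb > noun                          # ties and unseen contexts default to non-verb


print(analyze(PAST_STEMS[0] + ENDINGS[3]))      # structural split of "I went"
print(is_verb([I, HOMOGRAPH], 1), is_verb([THESE, HOMOGRAPH], 1))
```

In this sketch the structural pass accepts مردم as a possible verb, and the bigram pass is consulted only for tokens listed as homographs. A real system would need a full stem lexicon, the present-tense and compound forms, and counts estimated from an annotated corpus.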

Related resources

A Statistical Approach to Persian Light Verb Constructions

This article presents the linguistic bases of Persian light verb constructions and shows the corpus-based construction of lists of collocates for some common Persian verbs. The proposed methods of corpus construction are language independent, and the good results on a relatively small corpus of 20 million words confirm the power of association measures based on the hypergeometric distribution. ...

Supervised Morphology Generation Using Parallel Corpus

Translating from English, a morphologically poor language, into morphologically rich languages such as Persian comes with many challenges. In this paper, we present an approach to rich morphology prediction using a parallel corpus. We focus on the verb conjugation as the most important and problematic phenomenon in the context of morphology in Persian. We define a set of linguistic features usi...

A corpus-based translation study on English-Persian verb phrase ellipsis

The present research is a descriptive corpus-based translation study aiming at pinpointing the patterns of translation into Persian when dealing with English Verb Phrase Ellipsis (VPE). After scrutiny of the strategies applied by Persian translators some regular patterns were drawn, with the exception that the observed translation behavior may be taken as advantageous information for improving ...

Collocational Clashes in the Persian Translations of Tuesdays with Morrie

This study aimed at finding features of collocational deviations in the translations of Tuesdays with Morrie. In this direction, categories of collocations and collocational clashes, as well as causes of collocational clashes were explored. The present work investigated five Persian translations of the novel. All the books were examined completely and all possible collocational clashes were...

Developing Monolingual Persian Corpus for Extrinsic Plagiarism Detection Using Artificial Obfuscation: Notebook for PAN at CLEF 2015

The task of text alignment corpus construction at PAN 2015 competition consists of preparing a plagiarism corpus so that it can provide various obfuscation types and versatile obfuscation degrees. Meanwhile, its format and metadata structure should follow previous PAN plagiarism corpora. In this paper, we describe our approach for construction of a monolingual Persian plagiarism corpus that can...

Journal:
  • JDCTA

Volume 3

Publication date: 2009